DOCUMENT RESUME

ED 024 528                                        RE 001 418

By-Rosenshine, Barak
New Correlates of Readability and Listenability.
Paper presented at International Reading Association conference, Boston, Mass., April 24-27, 1968.
EDRS Price MF-$0.25 HC-$0.95
Descriptors-Evaluation Criteria, *Readability, Reading Comprehension, Reading Difficulty, *Reading Level, *Reading Material Selection, Textbook Selection

Horizontal readability, the analysis of essentially similar [...]
classification of words and phrases according [...] material designed for
[...] event or idea was being presented [...] comprehension. [...]

References are listed. (BS) 






U. S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE 
OFFICE OF EDUCATION 



THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE 
PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION 
POSITION OR POLICY. 



NEW CORRELATES OF READABILITY AND LISTENABILITY 






Barak Rosenshine 



Temple University 



Paper delivered in session on 
Research Reports: Readability 



Thirteenth Annual Convention of the 
International Reading Association 

Boston, Massachusetts 
April, 1968 






NEW CORRELATES OF READABILITY AND LISTENABILITY 



1 



There are three steps in investigating the correlates of reading
difficulty. First, investigators order a number of reading materials
according to a criterion of difficulty, usually comprehension; second,
they analyze the materials for internal, linguistic factors which predict
variation in difficulty; and third, they combine the factors which are
most predictive of difficulty into a multiple regression formula. Research
by this paradigm has been fruitful, and the findings have been remarkably
consistent. Almost without exception the studies have shown that difficult
reading material contains longer words and longer sentences. Recent
studies which have used refined measurement techniques enable us to
combine a variety of additional linguistic variables into prediction
equations which correlate .9 or better with the difficulty of the
passages (1) (2).
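The third step, combining predictors into a multiple regression formula, can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of any study cited here: the function names `fit_least_squares` and `predict`, the choice of two predictors (letters per word and words per sentence), and the toy data are all invented for the example.

```python
def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X'X)b = X'y.

    rows    -- list of predictor lists, one per passage (hypothetical data)
    targets -- list of difficulty scores, one per passage
    Returns [intercept, b1, b2, ...].
    """
    # Prepend a constant 1.0 so the first weight is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])

    # Build the augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    return [A[i][k] / A[i][i] for i in range(k)]


def predict(weights, row):
    """Apply the fitted formula to one passage's predictor values."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Toy data (invented): [letters per word, words per sentence] for five
# passages, with difficulty generated as 1 + 2*letters + 0.5*words.
passages = [[4.0, 10.0], [4.5, 14.0], [5.0, 12.0], [5.5, 20.0], [6.0, 18.0]]
difficulty = [1 + 2 * a + 0.5 * b for a, b in passages]

weights = fit_least_squares(passages, difficulty)
print([round(w, 3) for w in weights])
```

Because the toy difficulty scores are an exact linear function of the two predictors, the fit recovers the generating weights; with real passages the fit would be approximate and the correlation between predicted and observed difficulty would have to be checked separately.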

In this type of research, the passages are usually short - ranging
from 150 to 300 words - differ widely in content, and represent a range
of difficulty of eight years or more. Traditional readability research
might be labeled vertical studies because the difficulty of the passages
ranges from high to low. Coleman (2) has pointed out that the predictor
variables developed from such vertical studies are very useful in
distinguishing among passages that range widely, such as high school and
grammar school prose.

1. The primary intellectual debts in this report are owed to professors
N. L. Gage (Stanford University), John Bormuth and Carl Rinne (both
of the University of Chicago), E. B. Coleman (University of Texas
at El Paso) and my wife, Barbara.

But a more difficult problem is the development of measures which
will distinguish between the effectiveness of essentially similar passages.
For example, if five 5th grade American History texts are roughly equal
in letters per word and words per sentence, can the teacher assume that
the texts will be equally comprehensible? If not, then what measures will
identify the effective and ineffective texts? The development of
measures which distinguish between similar material might be termed studies
of horizontal readability. This is a report of three studies of horizontal
readability.

In the first study, Peterson (13) took two 950-word passages from
social studies textbooks - one on feudalism and the other on imperialism -
and rewrote each of the original passages into an organized version
and a "human interest" version. The subjects were seventh-grade students.

A second researcher, Ray Funkhouser, at the Institute for Communication
Research at Stanford University (7) (8) studied the problems in
communicating science material to non-scientists. The experimental
variable in this study was a set of 10 different eight-page articles on
enzymology written for an audience of non-specialists. Nine of the
articles were written by professional science writers, and the tenth was
written by a biochemist. The subjects were first- and second-year college
students who were not majoring in science.

In a third study (14) (15), 40 twelfth-grade social studies teachers
gave two 15-minute lectures to their students on successive days: first
on Yugoslavia and then on Thailand. All lectures were prepared from
identical material: 2500-word articles taken from the Atlantic magazine.
There is some question about including a study of lecturing, but the
research summaries by Klare (11) and Travers (18) indicate that readability
and listenability are highly correlated.

In each of the three studies, all students took a common test based
on the main ideas of the original material. All three investigators
adjusted the students' test scores for a measure of student knowledge or
aptitude. In addition, Funkhouser (7) (8) and Rosenshine (14) adjusted
the students' posttest scores for the relevance of the material in the
article they read or the lecture they heard to the items on the common
test.

In all three studies, although the passages contained similar
material and were intended for the same audience, there were significant
differences in the effectiveness of these passages as measured by the
students' adjusted comprehension scores. Peterson (13) found, as she
predicted, that the students who read the organized version or the human
interest version had significantly higher adjusted test scores on a common
test than those who read the original passages on feudalism and
imperialism. The 10 articles on enzymology, the lectures on Yugoslavia, and
the lectures on Thailand all differed significantly in their effectiveness.



Results 

What variables accounted for these differences? First, some negative
findings. In all of these studies, the traditional readability
variables and the traditional readability formulas did not discriminate
between the high-comprehension producing passages and the low-comprehension
producing passages. In all three studies, there were no significant
differences between the passages on measures of word length, word
difficulty, and sentence length. Funkhouser (8) found that three
readability measures such as the Flesch (5) and the Dale-Chall (3), and
three versions of the "cloze score" (17) did have moderate and consistent
correlations with the adjusted measure of comprehension, but none of the
correlations was statistically significant. In addition, Peterson's
materials did not differ in the number of personal pronouns or the number
of personal sentences. Funkhouser found no significant differences in
words per paragraph, type-token ratio, percent of lines of analogy,
percent of lines of definition, and percent of lines of non-science
material. In the lectures on Yugoslavia and Thailand (14) there were no
significant differences in the length and structure of independent
clause units, frequency and proportion of prepositional phrases, and in
the use of personal reference pronouns, passive verbs, and awkward and
fragmented sentences.
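For reference, the kind of traditional measure that failed to discriminate among these similar passages can be sketched as follows. The formula is the published Flesch Reading Ease equation; everything else is an assumption of this sketch, in particular the crude vowel-group syllable counter and the function names `count_syllables` and `flesch_reading_ease`, which are not taken from any of the studies discussed.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption of this sketch): one syllable per vowel group."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentence) - 84.6*(syllables/word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


# Invented sample sentences, used only to exercise the function.
easy = "The cat sat on the mat. The dog ran to the man."
hard = "Enzymology investigates catalytic mechanisms underlying biochemical transformations."
print(flesch_reading_ease(easy) > flesch_reading_ease(hard))
```

A score of this kind depends only on word and sentence length, which is why two passages with equal letters per word and words per sentence receive nearly the same score regardless of how well they explain their content.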

There were, however, five promising variables which emerged from
this research. They are: vagueness, explaining links, frequency of
examples, the rule-example-rule pattern, and something which might be
labeled irrelevancy.

1. Vagueness. Page (12), Hiller, and others (9) have developed
computer programs to count the frequencies of certain stylistic elements
in essays. They have found that the "essay grades" developed from the
computer count of these stylistic elements correlated significantly with
the grades assigned to the same essays by humans (9) (12). One of the
categories which Hiller developed for the analysis of essays was labeled
vagueness and defined as a writing style characterized by an excessive
proportion of qualification, haziness, and ambiguity.






Hiller, et al. (10) expanded the list of words and phrases taken to
indicate vagueness and used a computer to count the proportion of
vagueness words and phrases in 32 of the Yugoslavia lectures and 23 of the
Thailand lectures. Hiller, et al. found that the proportion of words
classified in the subcategory indeterminate qualifiers and the proportion
of words classified as probability had significant negative correlations
with the difficulty of both the Yugoslavia and the Thailand lectures.
Indeterminate qualifiers are words such as "rather," "very," "any number
of," "more or less," "little," "few," "some," "pretty much," and "quite a
bit." Probability words include "could be," "might," "possibly,"
"sometimes," "more often than not," "may," "usually," "likelihood," and
"most of the time." The high-scoring lectures, then, had fewer indeterminate
qualifiers and probability words.

Hiller's findings indicate that although the use of short words
usually correlates positively with reading ease, there are some pretty
short words which, more often than not, might possibly detract very much
from readability, more or less.
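A count of this kind is easy to sketch. The two word lists below are the examples quoted in the text, not Hiller's full dictionaries, and the matching rule (longest phrase first, counting every word inside a matched phrase) and the function names are assumptions made for illustration. A simple word match also cannot tell the verb "may" from the month, so a real count would need hand checking.

```python
import re

# Example words and phrases quoted in the text; the original lists were longer.
INDETERMINATE_QUALIFIERS = ["rather", "very", "any number of", "more or less",
                            "little", "few", "some", "pretty much", "quite a bit"]
PROBABILITY_WORDS = ["could be", "might", "possibly", "sometimes",
                     "more often than not", "may", "usually", "likelihood",
                     "most of the time"]


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_matched_words(tokens, phrases):
    """Count the words covered by non-overlapping phrase matches, longest phrase first."""
    patterns = sorted((p.split() for p in phrases), key=len, reverse=True)
    covered = 0
    i = 0
    while i < len(tokens):
        for pattern in patterns:
            if tokens[i:i + len(pattern)] == pattern:
                covered += len(pattern)
                i += len(pattern)
                break
        else:
            i += 1
    return covered


def vagueness_proportions(text):
    """Proportion of all words that fall in each vagueness subcategory."""
    tokens = tokenize(text)
    if not tokens:
        return {"indeterminate": 0.0, "probability": 0.0}
    return {
        "indeterminate": count_matched_words(tokens, INDETERMINATE_QUALIFIERS) / len(tokens),
        "probability": count_matched_words(tokens, PROBABILITY_WORDS) / len(tokens),
    }


sample = ("there are some pretty short words which, more often than not, "
          "might possibly detract very much from readability, more or less.")
print(vagueness_proportions(sample))
```

Matching on whole tokens rather than substrings keeps "some" from being counted inside "sometimes."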

2. Explaining Links. In my analysis of the Yugoslavia and Thailand
lectures (14) (15), I assessed the frequency of explanation by counting
explaining links, that is, prepositions and conjunctions which indicate
that the cause, result, or means of an event or idea is being presented.
Explaining links are words and phrases such as "because," "in order to,"
"if...then," "therefore," "consequently," and "by means of," as well as
specified instances of words such as "since," "by," "through," and "so."
The high-scoring lectures on Yugoslavia and on Thailand used more of these
explaining links in each of three units of measure: per lecture, per
minute, and per 100 words.
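The per-100-words measure can be sketched as below. This is a rough approximation under stated assumptions rather than the coding procedure used in the study: it counts only the unambiguous links named above, treats "if...then" as an "if" followed by a "then" in the same sentence, and omits "since," "by," "through," and "so" because those were counted only in specified instances that a simple word match cannot identify. The function names are hypothetical.

```python
import re

# Unambiguous links named in the text. Context-dependent links
# ("since", "by", "through", "so") are deliberately left out of this sketch.
EXPLAINING_LINKS = ["because", "in order to", "therefore",
                    "consequently", "by means of"]


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_phrase(tokens, phrase):
    """Count occurrences of a (possibly multi-word) phrase in a token list."""
    pattern = phrase.split()
    n = len(pattern)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == pattern)


def count_explaining_links(text):
    """Count explaining links sentence by sentence."""
    total = 0
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        total += sum(count_phrase(tokens, link) for link in EXPLAINING_LINKS)
        # "if ... then": an "if" followed later in the same sentence by "then".
        if "if" in tokens and "then" in tokens[tokens.index("if") + 1:]:
            total += 1
    return total


def links_per_100_words(text):
    """Explaining links per 100 words of running text."""
    n_words = len(tokenize(text))
    return 100.0 * count_explaining_links(text) / n_words if n_words else 0.0


linked = "The Chinese dominate Bangkok's economy; therefore, they are a threat."
print(count_explaining_links(linked), links_per_100_words(linked))
```

Dividing by lecture length in minutes instead of words would give the per-minute measure; that requires timing data which the text alone does not supply.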



The identification of explaining links may be one step in developing
a measure of the connectiveness of material. Words such as these
explaining links may function to link phrases either within or between
sentences so that a phrase or clause containing an explaining link
elaborates and expands upon another phrase or sentence. This special
linkage may be illustrated by the following three sentences which are
almost identical:

(1) The Chinese dominate Bangkok's economy, and they
are a threat.

(2) The Chinese dominate Bangkok's economy, but they are
a threat.

(3) The Chinese dominate Bangkok's economy; therefore,
they are a threat.

The third sentence may be the easiest to comprehend because it 
contains the explaining link "therefore" instead of a conjunction such 
as "and" or "but." Different types of explaining links also seem to be 
interchangeable, as in the following three examples:

(1) The Chinese dominate Bangkok's economy; therefore,
they are a threat. (Statement of consequence)

(2) The Chinese are a threat because they dominate 
Bangkok's economy. (Statement of cause) 

(3) By dominating Bangkok's economy, the Chinese are a 
threat. (Statement of means) 

It should be noted that the explaining links which were counted in 
this study were only a convenience for identifying "explaining sentences." 
There is no claim that the words selected as explaining links represent 
all the words which could be selected. One next step will be to investigate
this category more closely, eliminating words which are not true 
explaining links and determining whether certain nouns and verbs can be 
included within this category. 



I also counted the number of explaining links in the passages 
developed by Peterson and found that the frequency of explaining links 
did discriminate between the high-scoring and low-scoring passages on 
imperialism, but that the differences were not significant for the 
passages on feudalism. 

3. Examples. Funkhouser (7) found that for the material on 
enzymology, the proportion of lines giving examples was a significant 
positive correlate of effectiveness. In the lectures on Yugoslavia and 
Thailand the number of examples was not significant. Such a contrast
makes sense because enzymology appears to be a more difficult topic to
the reader than the political and economic affairs of Yugoslavia, and 
because the science articles were rated by the Flesch reading ease formula 
as more difficult than the social studies lectures. These results 
appear to indicate that the frequency of examples becomes more important 
as the conceptual difficulty of the material increases. 

4. Rule and Example Pattern. Although the high-scoring lectures 
on Yugoslavia and Thailand did not contain more examples or sections of 
examples, they differed from the low-scoring lectures in the pattern of 
examples. The high-scoring lectures used a summarizing rule twice, both 
before and after a series of examples (15). 

For example, when the high-scoring lecturers were discussing
Yugoslavia's problems with inflation, they would begin with a general
statement such as "Tito is attempting to deal with the problems of
inflation," or "They are attempting to curb inflation." They would follow
this general statement with a number of examples and close by re-stating
the general statement using sentences such as "So you can see that they
have a problem with inflation." Some high-scoring lecturers restated
the principle indirectly by beginning the next sentence with "In addition
to the problem of inflation, Yugoslavia also..." In contrast, the
low-scoring teachers used only one summary statement, usually before the
series of examples.

These results indicate that a pattern which presents a structuring 
statement first, follows it with details, and concludes with a structuring 
statement is more effective than either an inductive or a deductive 
pattern of explanation. An extension of this idea might be that some 
paragraphs would be more effective if they began and ended with a topic
sentence.
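The three patterns of explanation can be told apart mechanically once each sentence has been labeled by hand as a rule (a structuring or summary statement) or an example. The sketch below assumes such hand labels already exist; the labels, the function name `classify_pattern`, and the category names are illustrative assumptions, and the study itself did not describe an automated procedure.

```python
import re


def classify_pattern(labels):
    """Classify a sequence of hand-assigned sentence labels.

    labels -- list of "rule" or "example" strings, in presentation order.
    Any other label raises KeyError.
    """
    code = "".join({"rule": "R", "example": "E"}[label] for label in labels)
    if re.fullmatch(r"R+E+R+", code):
        return "rule-example-rule"
    if re.fullmatch(r"R+E+", code):
        return "deductive"      # rule first, then examples
    if re.fullmatch(r"E+R+", code):
        return "inductive"      # examples first, then rule
    return "other"


# Hypothetical coding of one lecture segment on inflation.
segment = ["rule", "example", "example", "example", "rule"]
print(classify_pattern(segment))
```

The hard part in practice is the labeling itself, since a sentence giving an example and a sentence stating the rule need not differ in structure.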

5. Irrelevancy. Although Funkhouser found that increased redundancy 
of examples is a positive correlate of effectiveness, his results also 
suggest that not all redundancy is useful because in his study, the 
number of lines relevant to each test item had a negative correlation 
with effectiveness. That is, the high-scoring articles had fewer lines 
related to an item on the test. Although this is surprising, it is not 
an isolated finding. Desiderato, et al. (4) reduced the length of
lectures, and Fletcher (6) reduced the length of film commentary by
eliminating digressions and irrelevancy. In both cases the reduction in
material resulted in significantly increased comprehension as measured by
test scores.

The existence of irrelevancy will complicate future research in this
area because irrelevancy will not be identified using the current
readability formulas. In all three studies the number of words per sentence
was not a significant correlate. So irrelevancy expresses itself by
extra sentences, not by extra long sentences. Irrelevant material may
also contain short words, an abundance of explaining links, and even
paragraphs which use the rule-example-rule pattern.



Implications

There are two general conclusions which can be drawn from these
three horizontal or limited-range studies. The first is that when
relatively long passages deal with similar material and are intended for
the same audience, the passages still differ in their difficulty or
comprehensibility. However, in these cases the readability formulas
are not particularly valid measures for distinguishing between the
effective and ineffective passages.

The second conclusion is that the measures which have been related
to the effectiveness of similar passages were developed by focusing upon
the cognitive function of key words and phrases. The words and phrases
in each of the significant findings are not linguistically or structurally
similar. For example, the words classified under indeterminate qualifiers
include such structurally different words as "few," "any great extent,"
and "more or less." Yet, these words share the cognitive function of
being vague qualifiers. The words classified as explaining links include
"because," "therefore," "by means of," and "in order to." Although
these words are structurally dissimilar, they do have a cognitive
similarity: they all introduce a clause or phrase which states a means,
reason, or consequence for the idea expressed in another clause. The
results are the same in the case of examples. We would be hard put to
find a structural difference between a sentence giving an example, and a
sentence which introduces or summarizes the topic. These results suggest
that in the study of similar passages, attempts to discriminate between
effective and ineffective presentations by counting only different parts
of speech may have limited promise. More significant results may be
obtained by also focusing upon the cognitive function of words and
phrases.

The suggestion that we consider the cognitive as well as the
linguistic function of words and phrases might also be developed from
the research of Coleman (2) and Bormuth. In Coleman's research, adverbs
of time and location are classified separately from other adverbs.
Coleman found that such a distinction has research merit. But Coleman's
categories require separate classifications for the adverb "now" and for
the prepositional phrase "at the present time;" separate classifications
for "usually" and for "most of the time." Bormuth (personal communication)
has suggested that all four examples be classified together as time
adverbials. If the proportion of time adverbials in a communication is
a significant correlate of reading ease, then we may obtain better
research results if we classify all words dealing with time and location
together, regardless of their structural use. For example, if the words
"yesterday" and "today" receive special attention when they are used as
adverbs, perhaps these words should receive the same attention when they
are used as nouns.

The cognitive approach to the study of reading difficulty suggests
not only combinations, but also new divisions of structural categories.
In Coleman's report, he cites the following eight words as examples of
the subclass predeterminers: each, all, both, half, any, some, most,
and few. The first four words are specific; the last four words - any,
some, most, and few - were among the words selected by Hiller as
indeterminate qualifiers. In future studies, separate classifications of
predeterminers into specific and indeterminate may yield productive
results. But for such research we will have to use passages longer than
200 words.

Conclusions 

This new research, horizontal studies of readability, consists of the
analysis of essentially similar passages and focuses upon classifying
words and phrases according to their cognitive similarity. This research
has produced promising potential correlates of reading difficulty, such
as vagueness words, explaining links, redundancy of examples, and the
rule-example-rule pattern. Experimental research will be necessary to
clarify these findings. This experimental research could involve
inserting and deleting explaining links and vagueness words from selected
passages.

It is too early to make definitive recommendations at this point,
but when teachers have to choose among similar texts with relatively
similar readability levels, they might be aided in their choice by
counting the proportion of vagueness words, rule-example-rule patterns,
and explaining links in these texts.

Future research in this area will be far from simple. In both 
experimental and correlational studies some of these results will fail 
to replicate, and new variables will emerge that will be even more 
bewildering. There will doubtless be complex interactions among cognitive 
and linguistic variables, relationships with effectiveness which are not 
linear, and interactions and correlations which change as the material 
becomes more complex. There is a need for more horizontal studies such
as these.



References 



1. Bormuth, J. R. Readability: A new approach. Reading Research
Quarterly, 1966, 1, 79-132.

2. Coleman, E. B. Developing a Technology of Written Instruction:
Some Determiners of the Complexity of Prose. In E. Z.
Rothkopf (Ed.) Verbal Learning Research and a Technology of
Written Instruction. Rand McNally, in press.

3. Dale, E. and Chall, J. S. A formula for predicting readability.
Educational Research Bulletin, 1948, 27, 11-20.

4. Desiderato, O. L., Kanner, J. H., and Runyon, R. P. Procedures
for improving television instruction. Audio-Visual
Communication Review, 1956, 4, 57-63.

5. Flesch, R. F. A new readability yardstick. Journal of Applied
Psychology, 1948, 32, 221-233.

6. Fletcher, R. M. Profile analysis and its effect on learning when
used to shorten recorded film commentaries. Technical Report
SDC 269-7-55. Port Washington, Long Island, N.Y.: U.S. Naval
Special Devices Center, 1955.

7. Funkhouser, G. R. Communicating science to non-scientists. Paper
presented at the meeting of the American Association for Public
Opinion Research, Lake George, New York, 1967.

8. Funkhouser, G. R. Readability in science writing. Paper presented
at the meeting of the Association for Education in Journalism,
Boulder, Colorado, 1967.

9. Hiller, J. H. and Marcotte, D. R. A computer search for the traits
of opinionation, vagueness, and specificity-distinctiveness
in student essays. Paper presented at the meeting of the
American Psychological Association, Division 5, Washington, 1967.

10. Hiller, J. H., Fisher, G., and Kaess, W. A computer investigation
of characteristics of teacher lecturing behavior. Paper
presented at the meeting of the American Educational
Research Association, Chicago, February, 1968.

11. Klare, G. R. The measurement of readability. Ames, Iowa: Iowa
State University Press, 1963.

12. Page, E. B. The imminence of grading essays by computer. Phi Delta
Kappan, 1966, pp. 238-243.

13. Peterson, E. M. Aspects of readability in the social studies.
Bureau of Publications, Teachers College, Columbia University,
New York: 1954.

14. Rosenshine, B. Objectively measured behavioral predictors of
effectiveness in explaining. Paper presented at the meeting
of the American Educational Research Association, Chicago, 1968.

15. Rosenshine, B. Behavioral predictors of effectiveness in explaining
social studies material. Unpublished doctoral dissertation,
Stanford University, 1968.

16. Strunk, W., Jr., and White, E. B. The elements of style. New York:
Macmillan, 1965.

17. Taylor, W. "Cloze procedure": A new tool for measuring readability.
Journalism Quarterly, 1953, 30, 415-433.

18. Travers, R. M. W. et al. Research and theory related to audiovisual
information transmission. Interim Report, U.S. Department of
Health, Education, and Welfare; Office of Education Contract
No. 3-20-003. Salt Lake City, Utah: University of Utah,
Bureau of Educational Research, 1964.






Additional Examples of the Use of Explaining Links

The four examples presented below are additional, and
perhaps more dramatic, illustrations of the use of explaining
links. Each example is the material which a different teacher
used to cover one of the given criterion questions in the Thailand
lecture. The original question, which all teachers were given,
was:

Chinese citizens of Thailand, deemed by many as
being prone to Communist subversion,

a) dominate the country's commerce.
b) are a small minority.
c) are many in number but have little government influence.
d) make up most of the lower-economic population.

The correct response is a.



The four excerpts below are taken from two high-scoring 
lectures and two low-scoring lectures. The explaining links are 
underlined. The purpose of the four examples is to show the 
stylistic differences between the high and low lectures. All 
four examples cover the criterion question; in fact, all four 
almost quote the correct response. But within their presentation, 
the two high-scoring lecturers are using explaining links whereas 
the two low-scoring lecturers are not. 



Teacher IT 

They give an example of one of [...]
Thailand faces. They had a recent offering [...] 25
scholarships to anyone in the country. [...] Thais [...]
out of 25, 23 were Chinese and 2 were native Thais.
Now this shows one thing, that [...]
though they are a minority, dominate the country's
business and commerce. And this is [...]
think, generally throughout [...]
Bangkok's population of two and a half [...]
about half are Chinese. That's [...]
not the country. These Chinese, as you might
imagine, having a stronger link to their homeland,
native homeland China, are more susceptible [...]
communism than the native Thais.

Teacher 2^T 

However, there are some other large groups.
Spread throughout China, through Thailand [...]
large number of Chinese. In the capital city of
Bangkok, about half the population is [...]
That means approximately one million [...]
are Chinese. These Chinese have not really mixed
well with the population, [...]
and places they've been. They've tended [...]
to themselves, as the overseas [...]
Asian countries. They also have tended to work
in certain areas. For example, in commerce, and to
a great extent they dominate commerce,
they're the ones who are the businessmen, traders,
[...] in Thailand. And in this sense, they
probably represent the wealthiest ethnic group in
Thailand. That is to say, in terms of money they
would be the upper crust rather than the Thai
people.



Teacher 12T 

There are some areas of unrest even though
the picture looks rosy so far. The Chinese [...]
to dominate the commerce of the country. In
Bangkok itself, over half [...]
Chinese and they take care of most of the
commercial practices. And this is a bad thing in
relation to the future of Thailand because the
Thai people themselves can't produce leaders and
they have to, you might say, [...]
influence of the Chinese, because of their
commercial wealth, is felt in [...]
of course then is another area of tenderness
because the Chinese seem to be more [...]
Communist way of thinking, they seem [...]
militant than the easy-going Thai who are, by
religion, kinda passive and easy-going.



Teacher 3^ 



Now the people here are Thai, but they have
problems with other Asiatic peoples. One of
these particular problems is with the Chinese.
The Chinese in Bangkok in this area are the
dominant force in the economy. They are in
control of the economy, industries that they
have, the shops, and so forth. The businessman
in Thailand in this area, the central part, is
Chinese. The Thai people are Buddhist, easy-going,
sort of relaxed, not too ambitious. The
Chinese are very ambitious and want to get ahead,
so they dominate. Recently, when they had a
company offer the Thai people 25 scholarships,
and the scholarships were competitive, [...]
the 25 scholarships, 23 were won by the Chinese
minority and two were won by the Thai [...]
a simple competitive examination. [...] this is
one of their problems. The Thai people are easy-going
and the Chinese are dominant, but the
problems here with the Chinese is that they can
be influenced by Communist China [...]
that the Chinese here are susceptible to influence
from Communism, especially from their
mother country, Communist China. [...]
are upset by the fact that the Chinese are more
this way. The Thais themselves because of their
Buddhist faith and the fact that [...]
out that Communism is not too attractive to [...],
so they're not as willing to fall for the
Communist line.



